To compare the use of randomized controls (RCTs) and historical
controls (HCTs) for clinical trials, we searched the literature for 
therapies studied by both methods. We found six therapies for which 
50 RCTs and 56 HCTs were reported. Forty-four of 56 HCTs (79 
percent) found the therapy better than the control regimen, but only 
10 of 50 RCTs (20 percent) agreed. For each therapy, the treated 
patients in RCTs and HCTs had similar outcomes. The difference 
between RCTs and HCTs of the same therapy was largely due to 
differences in outcome for the control groups, with the HCT control 
patients generally doing worse than the RCT control groups. Ad- 
justment of the outcomes of the HCTs for prognostic factors, when 
possible, did not appreciably change the results. The data suggest 
that biases in patient selection may irretrievably weight the outcome 
of HCTs in favor of new therapies. RCTs may miss clinically im- 
portant benefits because of inadequate attention to sample size. The 
predictive value of each might be improved by reconsidering the use 
of p < 0.05 as the significance level for all types of clinical trials, and
by the use of confidence intervals around estimates of treatment
effects. 

Since James Lind's experiments on the treatment of scurvy in 1747,
the controlled clinical trial has been increasingly recognized as the 
best method of establishing the value of new therapies. The number 
of controlled trials published has grown in recent years [1], and they
have become increasingly sophisticated in terms of design, man- 
agement and analysis. A comparatively new development has been 
the use of randomization in the assignment of patients to treatment 
and control groups. Randomization has a number of practical and 
theoretical advantages including reduction of bias in assignment of 
patients to treatment groups, the opportunity for blinding of patient 
and physician as to treatment and the provision of a setting in which 
the assumptions underlying statistical tests are more closely ap- 
proximated. If the investigators so choose, RCTs can be stratified to 
ensure that known prognostic factors are equally divided between 
treatment and control groups. The investigators depend on the ran- 
domization process to produce a reasonable division of unknown or 
unstratified risk factors, and this is generally (but certainly not always) 
the result. 

Acceptance of the RCT is growing but still far from universal, and 
the majority of published clinical trials do not use this method [1,2]. 
Recent articles have argued that RCTs are impractical in surgery [3,4] 
and often unnecessary in cancer medicine [5]. Double-blind RCTs 
have been criticized on ethical and practical grounds, and it is claimed 
that comparison with recently treated patients with the
same disorder (HCTs) can evaluate new therapies more
rapidly without exposing patients to possibly ineffective
therapies and uncertainties [6]. Proponents of HCTs
claim that adjustments can be made for differences in
known prognostic factors between the two groups,
leaving a valid estimate of the effect of the treatment.
These issues have been debated in the literature and
were a major topic at a recent international symposium
[7]. The present study attempts to provide data for a
rational comparison by looking at therapies studied by
both methods.

TABLE I Conclusions of RCTs and HCTs on Six Therapeutic Questions

MATERIALS AND METHODS 

Since 1955, one of the authors (TCC) has maintained a file 
of RCTs published in English. Articles for this file are gathered 
by computer and manual literature searches of Index Medicus 
on specific topics of interest, by weekly review of Current 
Contents and by checking references of reviews and papers 
already in the file. At the same time, HCTs and uncontrolled 
trials are also filed. RCTs were defined as trials in which both 
treatment and control groups were gathered prospectively 
and randomly assigned. Trials in which prospectively col- 
lected treatment groups were compared with either previ- 
ously published series or previously treated patients at the 
same institutions were considered HCTs if the authors drew 
conclusions about relative efficacy from these comparisons. 
HCTs were further subdivided into those that simply compared 
over-all outcome and those that matched or adjusted outcome 
rates on the basis of prognostic categories, or provided 
sufficient data so that some such adjustments could be made 
by the reader. Therapies were considered for inclusion in the 
present study if at least two RCTs and two HCTs were found 
for the same therapy. When published reports gave results 
by prognostic categories as well as treatment, these data 
were used to produce adjusted survival, or other outcome, 
rates. For each paper, listed prognostic factors were 
weighted equally to produce an equivalent average rate [8]. 
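
The equal weighting of prognostic categories described above can be sketched as follows. This is only one plausible reading of the procedure, not the authors' own computation, and the function names and category counts are hypothetical. The sketch contrasts the crude rate (all survivors over all patients) with the average of the per-category rates.

```python
# Illustrative sketch only: the category counts below are hypothetical,
# not data from any trial reviewed here.

def crude_rate(strata):
    """Unadjusted rate: all survivors divided by all patients."""
    survivors = sum(s for s, _ in strata)
    patients = sum(n for _, n in strata)
    return survivors / patients


def equal_weight_rate(strata):
    """Average of the per-category rates, each category counted once."""
    rates = [s / n for s, n in strata if n > 0]
    return sum(rates) / len(rates)


# (survivors, patients) for good-, fair- and poor-risk categories
treated = [(45, 50), (24, 30), (4, 10)]   # mostly good-risk patients
control = [(18, 20), (32, 40), (16, 40)]  # mostly poor-risk patients

crude_diff = crude_rate(treated) - crude_rate(control)
adjusted_diff = equal_weight_rate(treated) - equal_weight_rate(control)

print(f"crude difference:    {crude_diff:.3f}")
print(f"adjusted difference: {adjusted_diff:.3f}")
```

In this made-up example the survival rate within each category is identical for treated and control patients, so the whole crude difference comes from the mix of patients and disappears after equal weighting.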



Studies were considered positive (i.e., they found the new 
therapy to be effective) if a statistically significant benefit in 
outcome was found for treatment over control regimen, or, 
when no statistical analysis was presented, if the authors 
concluded that the therapy was superior to the control regi- 
men. Trials that did not meet these criteria were considered 
negative. When multiple outcomes were studied, the most 
serious (e.g., death) was used. 

RESULTS 

One hundred six papers on six therapeutic questions 
met our criteria for inclusion in the study. Over-all, 10 
of 50 RCTs (20 percent) found a benefit from the ther- 
apy studied, while 44 of 56 HCTs (79 percent) on the 
same questions concluded that the therapies were 
beneficial (Table I). Twenty-nine of the 50 RCTs (58 
percent) gave the probability that the difference found 
could be due to chance (α) or provided sufficient data
so that the probability could be calculated; in seven (14 
percent), this probability was less than 0.05. Twenty-six 
of the HCTs (46 percent) provided probability values or 
data for estimating α; in 22, it was less than 0.05 (Table
II). Most of the papers that found a probability greater 
than 0.05 did not give the actual value. Thirty-one pa- 
pers (three RCTs and 28 HCTs) presented neither sta- 
tistics nor sufficient data to calculate them. 
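
As an illustration of how far apart the two sets of conclusions are, the over-all counts above (10 of 50 RCTs and 44 of 56 HCTs positive) can be set out as a two-by-two table. The sketch below applies a two-sided Fisher exact test to that table. This test is not part of the analysis reported here and is offered only as an assumption-laden check, since the trials are not a random sample of all trials.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x positives in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_observed * (1 + 1e-9))


# Counts taken from the text: positive trials out of all trials, by design
rct_pos, rct_total = 10, 50
hct_pos, hct_total = 44, 56

p_value = fisher_exact_two_sided(rct_pos, rct_total - rct_pos,
                                 hct_pos, hct_total - hct_pos)

print(f"RCTs positive: {100 * rct_pos / rct_total:.0f} percent")
print(f"HCTs positive: {100 * hct_pos / hct_total:.0f} percent")
print(f"two-sided Fisher exact p-value: {p_value:.2e}")
```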
Cirrhosis with Esophageal Varices. Twenty RCTs 
[9-28] and 18 HCTs [29-46] were found on treatment 
of cirrhotic patients with esophageal varices. Table I 
shows that six of 20 RCTs (30 percent) found a benefit 
from the therapy studied compared with 12 of 19 HCTs 
(63 percent). Two of the three HCTs that attempted to 
adjust or match treatment and control groups for 
prognostic factors (including age, sex, severity, con- 
current diseases, etc.) found a benefit. 

TABLE II Levels of Significance (α) Reported in 106 Clinical Trials

TABLE III Early Survival in the Treatment of Varices

Ten RCTs and seven HCTs gave survival data, either
in-hospital or long-term, that could be analyzed. The
average early survival rates for the treatment groups
in the two types of studies were 81 percent and 66 
percent, respectively (Table III). However, the early 
survival of the control groups was 74 percent in the 
RCTs and 33 percent in HCTs, and the apparent effect 
of the treatment increased from an average of 7 percent 
in the RCTs to 33 percent in the HCTs. The long-term 
survival data were similar. Thirteen trials compared 
surgery (mostly portosystemic shunt procedures) with 
medical management (Figure 1). The pooled five-year 
survival curves for the RCTs show little difference be- 
tween medical and surgical therapy. The survival of 
patients who underwent shunt procedures in HCTs is 
slightly better than both groups in the RCTs, but the 
survival of the control subjects in the HCTs falls rapidly 
and remains much lower throughout the five years. 
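
The arithmetic behind the apparent treatment effect in the early survival data can be sketched as below. The four percentages are those quoted above. The split of the gap into a treated-group part and a control-group part is an illustrative device, not a calculation from the source.

```python
# Average early survival (percent) as quoted in the text above
early_survival = {
    "RCT": {"treated": 81, "control": 74},
    "HCT": {"treated": 66, "control": 33},
}


def apparent_effect(groups):
    """Treated minus control survival, in percentage points."""
    return groups["treated"] - groups["control"]


effects = {design: apparent_effect(g) for design, g in early_survival.items()}

# How much larger the apparent effect is in HCTs than in RCTs
gap = effects["HCT"] - effects["RCT"]

# Split the gap into the part from treated groups and the part from controls
from_treated = early_survival["HCT"]["treated"] - early_survival["RCT"]["treated"]
from_control = early_survival["RCT"]["control"] - early_survival["HCT"]["control"]

print(effects)                   # apparent effect per design
print(gap, from_treated, from_control)
```

The treated patients in the HCTs did somewhat worse than those in the RCTs, so the larger apparent effect in the HCTs arises entirely from the much lower survival of the HCT control groups.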
Coronary Artery Surgery. Eight RCTs [47-54] and 
21 HCTs [55-75] (12 of which adjusted for or gave data 
on prognostic factors) were found on the surgical 
treatment of coronary artery disease. Only one of the 
RCTs found a significant benefit in over-all survival 
between surgically and medically treated groups (al- 
though certain subgroups showed a benefit from op- 
eration), but nearly all of the HCTs found a benefit from 
surgical treatment. A comparison of long-term survival
in the six RCTs and nine HCTs that gave such data is
shown in Table IV. The pooled HCTs show both a higher
survival for surgical patients and lower survival for
medical patients. When the HCT data are adjusted to
have the same over-all proportion of patients with one-,
two- and three-vessel disease as the RCTs, the difference
in survival between medical and surgical groups
is decreased, but remains larger than the difference in
the RCTs. (Only six of the HCTs provided data on proportions
of one-, two- and three-vessel disease.)

TABLE IV Pooled Survival in Clinical Trials of Medical versus Surgical Treatment of Coronary Artery Disease
* Adjusted to have the same proportion of patients with one-, two- and three-vessel disease as in the RCTs.

TABLE V Anticoagulants for Acute Myocardial Infarction

TABLE VI DES for Habitual Abortion

Figure 1. Survival of treated and control groups in clinical trials of shunt surgery for cirrhosis with esophageal varices.

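
Adjusting the HCT data to the RCT proportions of one-, two- and three-vessel disease amounts to a direct standardization. A minimal sketch follows, assuming stratum-specific survival rates are available. All rates and patient counts in it are hypothetical, and the function name is ours.

```python
def standardized_rate(stratum_rates, reference_mix):
    """Weight each stratum's rate by the reference population's share of it."""
    total = sum(reference_mix.values())
    return sum(stratum_rates[k] * reference_mix[k] / total for k in reference_mix)


# Hypothetical survival of medically treated HCT patients, by diseased vessels
hct_medical_survival = {1: 0.90, 2: 0.80, 3: 0.60}

# Hypothetical numbers of patients with one-, two- and three-vessel disease
hct_mix = {1: 20, 2: 30, 3: 50}   # HCT medical group: more severe disease
rct_mix = {1: 40, 2: 35, 3: 25}   # RCT patients: milder disease on average

crude = standardized_rate(hct_medical_survival, hct_mix)
adjusted = standardized_rate(hct_medical_survival, rct_mix)

print(f"crude HCT medical survival:    {crude:.2f}")
print(f"adjusted to RCT vessel mix:    {adjusted:.2f}")
```

Standardizing to the reference mix removes only the part of a difference that is due to the measured factor. Any imbalance in unmeasured factors remains.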
Anticoagulants for Acute Myocardial Infarction. Six 
RCTs [76-81] and six HCTs [82-87] of this treatment 
have been previously reviewed [88] (using a different 
definition of HCT), and the combined data suggested a 
beneficial effect. Since then, four additional RCTs have 
been found [89-92] (Table V). Nine of the 10 RCTs 
found no significant benefit from anticoagulants, al- 
though all but one showed a trend toward better survival 
in the treated patients. The pooled unadjusted results 
from six HCTs showed poorer survival in both treated 
and control groups, but the difference between treated 
and control groups in HCTs was four times as large as 
the difference between treated and control groups in 
RCTs. Four of the six HCTs published survival data that 
allowed adjustment for patient's age, sex, history of 
previous infarction, location and severity of infarction 
and presence of other diseases (some studies did not 
give data on all these variables). When the survival rates 
were adjusted for these variables, the difference be- 
tween treatment and control groups decreased from 
17.1 percent to 11.0 percent, but this was still more 
than twice the difference for the RCTs. 
5-Fluorouracil Adjuvant Therapy for Colon Cancer. 
Five RCTs [93-96] and two HCTs [97,98] of the effect 
of adjuvant 5-fluorouracil (5-FU) in patients with colon 
cancer undergoing surgery were found. Since some of 
the papers gave results only in terms of mortality and 
some gave only disease-free survival, they cannot be 
combined, but the pattern is similar. The RCTs found a
slight trend favoring the use of 5-FU, but no significant
differences in over-all survival, whereas both HCTs 
showed a much larger difference in favor of 5-FU. When 
the HCT results were adjusted for age, sex, hospital, 
surgeon, location of tumor, stage, histologic findings 
and presence of leukopenia, the differences between 
treatment and control groups were changed only 
slightly. Two studies of intraluminal 5-FU were found, 
with several of the same investigators participating in 
both studies. The first study was an HCT that found a 
benefit; however, when an RCT was done, there were 
almost identical survival curves for treatment and 
control groups. 

BCG Adjuvant Immunotherapy. Four RCTs [99-102] 
and four HCTs [103-106] looked at adjuvant therapy 
with BCG in patients with malignant melanoma. The 
papers did not provide sufficient data to adjust for 
prognostic factors. Also, since some reported survival 
from the time of operation, some from time of recur- 
rence, and some did not specify, the results could not 
be directly compared. All four of the HCTs reported 
significantly increased survival for the treated patients, 
while only two of the four RCTs found a benefit. 
Diethylstilbestrol for Habitual Abortion. The eight 
studies found on this question (all published before 
1955) included three RCTs [107-109], four unmatched 
HCTs [110-113] and one HCT that matched patients
for age and previous history [114]. The results of these
studies, in terms of percentage of live infants, are shown 
in Table VI. The RCTs found essentially no difference 
in outcome, whether or not diethylstilbestrol (DES) was 
given. The four unmatched HCTs had a very similar 
success rate in the treated patients, but the HCT control 
group did notably worse. In the one matched HCT, both 
groups had poorer outcomes, but a large difference
between treatment and control groups was again 
found. 

COMMENTS 

In each of the six areas examined, the results of clinical 
trials were more dependent on the method of selection 
of control groups than on the therapy under study. For 
each of the questions, HCTs were much more likely 
than RCTs to find a difference, despite similar outcomes 
for the treated patients in the two types of study. The 
differences were in the outcomes for the control groups, 
and, in general, the control groups in the HCTs had 
notably poorer outcomes than those in the RCTs. 

In nearly always finding the treatment better than the 
control regimen, the HCTs we examined would rarely 
come to false-negative conclusions, but this may be at 
the expense of many false-positives. The RCTs have 
the opposite fault: by declaring most therapies no better 
than the control regimens, they would rarely come to 
false-positive conclusions but possibly create many 
false-negatives. The clinical trials examined are a small 
proportion of all published trials and were not chosen 
at random. The authors of some of the papers attached 
so many qualifications to their conclusions that other 
readers might not agree with our classification into 
positive and negative. Nevertheless, the papers include 
several specialty areas, both single-center and coop- 
erative trials and span four decades. Thus, we think it 
is very probable that our findings are applicable to trials 
published on other therapeutic questions. It might be 
that negative RCT results are more likely to be published 
than negative HCT results. New therapies may be tried 
on small numbers of patients, with poorer results than 
standard therapy, and the results never submitted or 
published. Practicing physicians, however, generally 
have little access to unpublished data and must rely on 
published results to decide how to treat their pa- 
tients. 

Can the accuracy of HCTs be increased? We fear 
there is little room for improvement in this area. HCTs 
using literature controls have difficulty distinguishing 
treatment effects from differences in ancillary care, 
diagnostic criteria, referral patterns or trends over time. 
HCT control groups generally include all patients seen 
who meet the diagnostic criteria for the disease under 
study. Criteria for inclusion in the treatment group are 
usually more stringent. The treatment group may be 
consciously or unconsciously narrowed to include only 
those patients the investigator feels are most likely to 
benefit from the therapy, and the patients are en- 
thusiastically recruited. Poor-risk patients may not be 
offered the treatment or offered it so unenthusiastically 
that they decline to participate. Even if selected for the
treatment, patients may not be selected for the report
of the treatment. Block et al. [115] have shown that 
uncontrolled trials of cancer therapy reported a higher 
proportion of patients listed as "nonevaluable" than 
controlled trials. 

The data presented suggest that such biases in patient
selection may irretrievably weight the outcome of the
HCT. It has been claimed that retrospective adjustment 
for prognostic factors can be used to produce an esti- 
mate of the effect of the treatment alone, but the studies 
we reviewed with such adjustments (either by the 
original authors or by us) showed nearly the same 
treatment effect as unadjusted studies. These adjustments
were relatively crude and did not take into account
possible interactions between prognostic factors.
Recently, more sophisticated step-wise multiple regression
procedures have been advocated [116], but
there is as yet little evidence to suggest such proce- 
dures can better recognize ineffective therapies. 

The accuracy of RCTs, on the other hand, could be 
improved by greater attention to sample size in planning 
studies. A recent review of 71 "negative" RCTs [117] 
found that a potential 25 percent improvement could 
have been missed in 57, and a potential 50 percent 
improvement in 34. At the planning stage of a trial, 
consideration of the size of the benefit sought and the 
number of patients needed to demonstrate it can keep 
the possibility of this sort of type II error at acceptable 
levels, but with increases in the cost and duration of the 
study. 
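
The sample size consideration described above can be illustrated with a common normal-approximation formula for comparing two proportions. This is a sketch under stated assumptions (two-sided α, equal group sizes, hypothetical event rates), not the method used in the review cited. The function name is ours.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect p_control vs p_treated."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)            # power = 1 - type II error
    p_bar = (p_control + p_treated) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    ) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


# Hypothetical control mortality of 40 percent
n_25 = patients_per_group(0.40, 0.30)              # 25 percent relative reduction
n_50 = patients_per_group(0.40, 0.20)              # 50 percent relative reduction
n_25_loose = patients_per_group(0.40, 0.30, alpha=0.20)

print(n_25, n_50, n_25_loose)
```

Under these assumptions a trial sized to detect only a 50 percent improvement is far too small to detect a 25 percent improvement, and relaxing α lowers the number of patients required.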

A possible solution is to reconsider the nearly auto- 
matic use of a p value of less than 0.05 as the critical 
point at which a difference is felt to be statistically 
significant. Perhaps well-designed and well-blinded 
RCTs with little chance for bias should be considered 
positive when α is less than 0.10 or 0.20. This would
increase the proportion of positive trials and save time 
and money. On the other hand, our data suggest that the 
opportunity for bias is so large in HCTs that when α is
less than 0.01 or even 0.001, the therapy still may not
be effective. The decision about what significance level 
to accept should also take into account other factors, 
including the prevalence of the disease, the medical and 
economic costs of the disease and of the therapy and 
the best pretrial estimate of the likelihood that the new 
therapy represents an advance. 
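
One way to see how the pretrial likelihood and the chosen significance level interact is to apply Bayes' rule to a positive result. The sketch below is a simplification, not a calculation from this study. It assumes an unbiased trial, treats α as the chance of a false-positive result, and uses hypothetical values throughout.

```python
def positive_predictive_value(prior, alpha, power):
    """Probability that a therapy is effective given a positive trial.

    prior: pretrial probability that the therapy is an advance (assumed)
    alpha: probability of a positive result when the therapy is ineffective
    power: probability of a positive result when the therapy is effective
    """
    true_positive = power * prior
    false_positive = alpha * (1 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical pretrial likelihood of 20 percent and power of 80 percent
for alpha in (0.01, 0.05, 0.20):
    ppv = positive_predictive_value(prior=0.20, alpha=alpha, power=0.80)
    print(f"alpha = {alpha:.2f}: predictive value of a positive trial = {ppv:.2f}")
```

A bias that raises the false-positive rate above the nominal α, as the data here suggest for HCTs, would lower these predictive values further.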

It is also important to use the results of a trial to es- 
timate the size of the difference between treatment and 
control groups, and to construct confidence intervals 
around this estimate [118]. This replaces the often 
arbitrary decision of whether outcome of the new 
treatment is significantly different from that of the old 
with the best single estimate of the size of the difference 
and the range in which the true difference most likely 
falls. 
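
A minimal sketch of such an estimate follows, using the simple normal approximation for the difference between two proportions. The patient counts are hypothetical, and other interval methods may be preferable for small trials.

```python
from math import sqrt
from statistics import NormalDist


def difference_with_ci(success_t, n_t, success_c, n_c, level=0.95):
    """Treated-minus-control difference in proportions with a Wald interval."""
    p_t, p_c = success_t / n_t, success_c / n_c
    diff = p_t - p_c
    se = sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical trial: 81 of 100 treated and 74 of 100 control patients survive
diff, lower, upper = difference_with_ci(81, 100, 74, 100)

print(f"estimated difference: {diff:.3f}")
print(f"95 percent confidence interval: {lower:.3f} to {upper:.3f}")
```

An interval of this kind shows at a glance whether a "negative" trial has excluded a clinically important benefit or has merely failed to demonstrate one.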